
    Sistema de toma de decisiones basado en emociones y autoaprendizaje para agentes sociales autónomos

    The objective of this thesis is to develop a decision-making system for an autonomous and social robot. This system is composed of several subsystems: a motivational system, a drives system, and an evaluation and behaviour-selection system. All of them are based on motivations, drives, and emotions, concepts that are described in detail in this thesis. Owing to the difficulties of working with a real robot, this decision-making system was first implemented on virtual agents. These agents live in a virtual world built using a text-based MUD (Multi-User Domain) role-playing game. In this world the agents can interact with each other, allowing social interaction, and with the other objects present in the world. A text-based game was chosen instead of a modern one with a graphical interface because the interpretation of the information is much simpler.
The selection of behaviours is learned by the agent using reinforcement learning algorithms. When the agent is not interacting with other agents, it uses the Q-learning algorithm. When social interaction exists, the reward the agent receives depends not only on its own actions but also on the actions of the other agent; in this case, the agent uses multi-agent learning algorithms, also based on Q-learning. The inner state of the agent is taken to be its dominant motivation, which means the system is not completely Markovian and is therefore harder to work with. To simplify the learning process, the states related to the objects in the world are considered independent of one another. The state of the agent is then a combination of its inner state and its state in relation to the other agents and objects. Treating the object-related states as independent means that possible relations between objects are ignored; in fact, actions performed on one object may affect the state in relation to other objects, causing "collateral effects". This thesis proposes an original variation of the Q-learning algorithm that takes these effects into account.
The system uses happiness and sadness as positive and negative reinforcement, respectively. Behaviours are therefore not selected to satisfy the goals determined by the agent's motivations, but to reach happiness and avoid sadness. Appraisal theories of emotion state that emotions are the result of evaluation processes and are therefore subjective. Based on these theories, the decision-making system generates certain emotions from an evaluation of the agent's wellbeing, which measures how far the agent's needs are satisfied. Happiness is produced when something good happens to the agent, increasing its wellbeing; sadness, on the contrary, is produced when something bad happens, decreasing it. Finally, the emotion of fear is introduced from two points of view: being afraid of executing risky actions, and being afraid of being in a dangerous state. In the latter case, fear is considered as another motivation of the agent, in accordance with other theories of emotion.
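The learning loop the abstract describes can be illustrated with a plain tabular Q-learning sketch. This is a toy example under stated assumptions (illustrative state, action, and parameter names), not the thesis implementation; the thesis' Object Q-learning additionally keeps separate value estimates per object and adds a correction for collateral effects.

```python
import random
from collections import defaultdict

# Illustrative learning parameters (assumed values, not the thesis').
ALPHA, GAMMA, EPSILON = 0.3, 0.9, 0.1

def make_q():
    # Q-table mapping (state, action) pairs to estimated values.
    return defaultdict(float)

def select_action(q, state, actions):
    # Epsilon-greedy selection over the available actions.
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])

def update(q, state, action, reward, next_state, actions):
    # Standard one-step Q-learning backup. In the thesis, the reward is
    # derived from happiness/sadness (changes in wellbeing), and one such
    # table is kept per object, with collateral effects handled separately.
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
```

A usage step: after `update(q, s, a, r, s2, actions)`, the entry `q[(s, a)]` moves a fraction ALPHA of the way toward the bootstrapped target.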

    Learning to Avoid Risky Actions

    When a reinforcement learning agent executes actions that frequently cause damage to itself, it can learn, using Q-learning, that these actions must not be executed again. However, other actions cause damage only once in a while, for example risky actions such as parachuting. These actions may imply punishment for the agent and, depending on its personality, it may be better to avoid them. Nevertheless, using the standard Q-learning algorithm the agent is not able to learn to avoid them, because the result of these actions can be positive on average. In this article, an additional mechanism for Q-learning, inspired by the emotion of fear, is introduced in order to deal with such risky actions by considering their worst results. Moreover, a daring factor adjusts how much the risk is taken into account. This mechanism is implemented on an autonomous agent living in a virtual environment, and the results show the performance of the agent with different daring degrees. The funds were provided by the Spanish Government through the project "A New Approach to Social Robotics" (AROS) of MICINN (Ministry of Science and Innovation) and through the RoboCity2030-II-CM project (S2009/DPI-1559), funded by Programas de Actividades I+D en la Comunidad de Madrid and cofunded by Structural Funds of the EU.
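The worst-result mechanism described above can be sketched as a blend between the learned expected value and the worst return observed so far, weighted by a daring factor. The names (`q_mean`, `q_worst`, `daring`) and the linear blend are illustrative assumptions; the paper's exact formulation may differ.

```python
# Hedged sketch: a risk-adjusted action value that interpolates between
# the expected return and the worst return ever observed.
def risk_adjusted_value(q_mean, q_worst, state, action, daring):
    # daring = 1.0 -> pure expected value (fearless);
    # daring = 0.0 -> pure worst case (maximally fearful).
    key = (state, action)
    return daring * q_mean.get(key, 0.0) + (1.0 - daring) * q_worst.get(key, 0.0)

def update_worst(q_worst, state, action, observed_return):
    # Keep the worst outcome ever observed for this state-action pair.
    key = (state, action)
    q_worst[key] = min(q_worst.get(key, observed_return), observed_return)
```

With `daring = 0.5`, an action averaging +2 but having once returned -5 gets value -1.5, so a cautious agent learns to avoid it even though its mean is positive.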

    Fast 3D cluster tracking for a mobile robot using 2D techniques on depth images

    Simultaneous user detection and tracking is an issue at the core of human-robot interaction (HRI). Several methods exist and give good results; many use image-processing techniques on images provided by the camera. The increasing presence of range-imaging cameras on mobile robots (such as structured-light devices like the Microsoft Kinect) allows us to apply image processing to depth maps. In this article, a fast and lightweight algorithm is presented for the detection and tracking of 3D clusters using classic 2D techniques, such as edge detection and connected components, applied to the depth maps. Clusters are recognised by their 2D shape. An algorithm for the compression of depth maps has been specifically developed, allowing the processing to be distributed among several computers. The algorithm is then applied to a mobile robot chasing an object selected by the user, and is coupled with laser-based tracking to make up for the narrow field of view of the range-imaging camera. The workload created by the method is light enough to enable its use even on processors with limited capabilities. Extensive experimental results verify the usefulness of the proposed method. The work was funded by the Spanish MICINN (Ministry of Science and Innovation) through the project "Applications of Social Robots" (Aplicaciones de los Robots Sociales).
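As a minimal sketch of treating a depth map with classic 2D techniques, the toy function below thresholds a depth band and labels 4-connected components, so that each connected region of similar depth becomes one cluster. This is an assumption-level illustration in pure NumPy (no edge detection, no compression), not the authors' implementation.

```python
import numpy as np

def label_depth_band(depth, near, far):
    # Keep only pixels whose depth falls inside [near, far].
    mask = (depth >= near) & (depth <= far)
    labels = np.zeros(depth.shape, dtype=int)
    count = 0
    for i in range(depth.shape[0]):
        for j in range(depth.shape[1]):
            if mask[i, j] and labels[i, j] == 0:
                count += 1
                stack = [(i, j)]
                while stack:  # iterative flood fill, 4-connectivity
                    y, x = stack.pop()
                    if (0 <= y < depth.shape[0] and 0 <= x < depth.shape[1]
                            and mask[y, x] and labels[y, x] == 0):
                        labels[y, x] = count
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return labels, count
```

Each labelled region is then a candidate 3D cluster whose 2D silhouette can be matched against the tracked target's shape.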

    Application of the fast marching method for outdoor motion planning in robotics

    In this paper, a new path-planning method for robots used in outdoor environments is presented. The proposed method applies Fast Marching to a 3D surface represented by a triangular mesh to calculate a smooth trajectory from one point to another. A triangular mesh is used instead of a square one because this kind of grid adapts better to 3D surfaces. The novelty of this approach is that, before running the algorithm, the method calculates a weight matrix W based on information extracted from the characteristics of the 3D surface; in the presented experiments these features are the height, the spherical variance, and the gradient of the surface. This matrix can be viewed as a difficulty map laid over the 3D surface and is used to limit the propagation speed of the Fast Marching wave in order to find the best path depending on the task requirements, e.g., the least-energy path, the fastest path, or the flattest terrain. The algorithm also gives the speed for the robot, which depends on the propagation speed of the wave front. The results presented in this paper show how the paths obtained differ as the matrix W is varied. Moreover, as shown in the experimental part, the algorithm is also useful for calculating paths for climbing robots in much more complex environments. Finally, it is shown that the algorithm can also be used for robot avoidance when two robots approach each other and know each other's position. Funded by the Comunidad de Madrid.
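The idea of a difficulty map W limiting wave speed can be sketched as follows, on a square grid for simplicity. The feature set (height and slope only) and the linear weighting are illustrative assumptions; the paper works on a triangular mesh and also uses the spherical variance with its own combination rule.

```python
import numpy as np

def difficulty_map(height, w_h=0.5, w_g=0.5):
    # Slope magnitude from finite differences of the height field.
    gy, gx = np.gradient(height.astype(float))
    slope = np.hypot(gx, gy)

    def norm(f):
        # Normalise a feature to [0, 1] before combining.
        rng = f.max() - f.min()
        return (f - f.min()) / rng if rng > 0 else np.zeros_like(f)

    # W is the difficulty map; the weights select the task requirement
    # (e.g. penalise height more for least-energy paths).
    W = w_h * norm(height.astype(float)) + w_g * norm(slope)
    # The Fast Marching wave is slowed where the terrain is difficult:
    # flat low terrain propagates fast, steep high terrain slowly.
    speed = 1.0 / (1.0 + W)
    return W, speed
```

The `speed` array is what a Fast Marching solver would consume as its local propagation speed, and it doubles as a speed reference for the robot along the resulting path.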

    Learning the selection of actions for an autonomous social robot by reinforcement learning based on motivations

    Autonomy is a prime issue in the robotics field and is closely related to decision making. Recent research on decision making for social robots has focused on biologically inspired mechanisms for taking decisions. Following this approach, we propose a motivational system for decision making that uses internal (drives) and external stimuli to learn to choose the right action. Actions are selected from a finite set of skills in order to keep the robot's needs within an acceptable range. The robot uses reinforcement learning to calculate the suitability of every action in each state. The state of the robot is determined by the dominant motivation and its relation to the objects present in its environment. The reinforcement learning method exploits a new algorithm called Object Q-learning. The proposed reduction of the state space, together with the new algorithm's handling of collateral effects (relationships between different objects), results in an algorithm suitable for robots living in real environments. In this paper, a first implementation of the decision-making system and the learning process is presented on a social robot, showing an improvement in the robot's performance; the quality of the performance is determined by observing the evolution of the robot's wellbeing. The funds were provided by the Spanish Government through the project "Peer to Peer Robot-Human Interaction" (R2H) of MEC (Ministry of Science and Education), the project "A new approach to social robotics" (AROS) of MICINN (Ministry of Science and Innovation), and the RoboCity2030-II-CM project (S2009/DPI-1559), funded by Programas de Actividades I+D en la Comunidad de Madrid and cofunded by Structural Funds of the EU.

    An autonomous social robot in fear

    Currently, artificial emotions are extensively used in robots. Most of these implementations are employed to display affective states; nevertheless, their use to drive the robot's behaviour is not so common, and that is the approach followed by the authors in this work. In this research, emotions are not treated in general but individually. Several emotions have been implemented in a real robot, but this paper focuses on the use of the emotion of fear as an adaptive mechanism to avoid dangerous situations. In fact, fear is used as a motivation which guides the behaviour during specific circumstances. Appraisal of fear is one of the cornerstones of this work: a novel mechanism learns to identify the harmful circumstances which cause damage to the robot. These circumstances then elicit the fear emotion and are known as fear releasers. In order to prove the advantages of considering fear in our decision-making system, the robot's performance with and without fear is compared and the behaviours are analysed. The behaviours the robot exhibits in relation to fear are natural, i.e., the same kinds of behaviour can be observed in animals; moreover, they have not been preprogrammed, but learned through real interactions in the real world. All these ideas have been implemented in a real robot living in a laboratory and interacting with several items and people. The funds were provided by the Spanish Government through the project "A new approach to social robotics" (AROS) of MICINN (Ministry of Science and Innovation) and through the RoboCity2030-II-CM project (S2009/DPI-1559), funded by Programas de Actividades I+D en la Comunidad de Madrid and cofunded by Structural Funds of the EU.

    Signage System for the Navigation of Autonomous Robots in Indoor Environments

    On many occasions, people need to go to certain places without any prior knowledge of the environment. This may happen when a place is visited for the first time, or when no map is available to orient ourselves. In those cases, the signs in the environment are essential for reaching the goal. The same situation arises for an autonomous robot: this kind of robot must be capable of solving the problem in a natural way, using the resources present in its environment. This paper presents an RFID-based signage system developed to guide an autonomous robot and give it important information. The system has been implemented in a real indoor environment and successfully tested on the autonomous social robot Maggie. At the end of the paper, experimental results obtained inside our university building are presented. Funded by the Comunidad de Madrid.

    Toma de decisiones en robótica

    This article presents, in tutorial form, an overview of the current state of the decision-making problem in robotics. The study takes a broad, integrative view and therefore avoids detailing specific solutions to specific problems. The article focuses mainly on the high-level decisions a robot must make, not on lower-level problems, which are solved with traditional control techniques or highly specific algorithms. We refer to a robot's "decision making" in the broad sense of determining the activities the robot is to carry out, that is, without any a priori exclusion based on issues such as the decision-making strategy employed, the type of robot, or the tasks it can perform. The article is structured as a series of sections covering various topics of interest in robotics from the perspective of decision making: autonomy, intelligence, goals, high-level decisions, decision-making strategies, control architectures, perception, human-robot interaction, learning, and emotions. This work was partially supported by the Spanish Government through the project "Peer-to-peer human-robot interaction" (R2H) of the Ministry of Education and Science and the project "A new approach to social robots" (AROS) of the Ministry of Science and Innovation, and was funded by the Comunidad de Madrid (S2009/DPI-1559/ROBOCITY2030 II); it was carried out by the Robotics Lab of the Universidad Carlos III de Madrid.

    A motivational model based on artificial biological functions for the intelligent decision-making of social robots

    Modelling the biology behind animal behaviour has attracted great interest in recent years. Nevertheless, neuroscience and artificial intelligence face the challenge of representing and emulating animal behaviour in robots. Consequently, this paper presents a biologically inspired motivational model to control the biological functions of autonomous robots that interact with and emulate human behaviour. The model is intended to produce fully autonomous, natural behaviour that can adapt to both familiar and unexpected situations in human-robot interactions. The primary contributions of this paper are novel methods for modelling the robot's internal state to generate deliberative and reactive behaviour, for how it perceives and evaluates stimuli from the environment, and for the role of emotional responses. Our architecture emulates essential animal biological functions such as neuroendocrine responses, circadian and ultradian rhythms, motivation, and affect, to generate biologically inspired behaviour in social robots. Neuroendocrine substances control biological functions such as sleep, wakefulness, and emotion. Deficits in these processes regulate the robot's motivational and affective states, significantly influencing the robot's decision making and, therefore, its behaviour. We evaluated the model by observing the long-term behaviour of the social robot Mini while interacting with people. The experiment assessed how the robot's behaviour varied and evolved depending on its internal variables and external situations, adapting to different conditions. The outcomes show that an autonomous robot with appropriate decision making can cope with its internal deficits and unexpected situations, controlling its sleep-wake cycle, social behaviour, affective states, and stress when acting in human-robot interactions. The research leading to these results received funding from the projects Robots Sociales para Estimulación Física, Cognitiva y Afectiva de Mayores (ROSES), RTI2018-096338-B-I00, funded by the Ministerio de Ciencia, Innovación y Universidades, and Robots sociales para mitigar la soledad y el aislamiento en mayores (SOROLI), PID2021-123941OA-I00, funded by the Agencia Estatal de Investigación (AEI), Spanish Ministerio de Ciencia e Innovación. This publication is part of the R&D&I project PLEC2021-007819 funded by MCIN/AEI/10.13039/501100011033 and by the European Union NextGenerationEU/PRTR.
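The drive/motivation/wellbeing bookkeeping such a model needs can be sketched as below. All names, rates, and formulas are illustrative assumptions for a toy example, not the model's actual equations.

```python
# Toy sketch: drives grow over time until satisfied, motivation intensity
# combines a drive with its related external stimulus, and wellbeing falls
# as drives depart from their ideal (zero) value.

def step_drives(drives, rates, dt=1.0):
    # Each drive (e.g. an energy or social deficit) grows at its own rate.
    return {d: drives[d] + rates[d] * dt for d in drives}

def dominant_motivation(drives, stimuli):
    # Intensity = internal drive plus any related external stimulus;
    # the strongest motivation becomes the robot's inner state.
    intensity = {d: drives[d] + stimuli.get(d, 0.0) for d in drives}
    return max(intensity, key=intensity.get)

def wellbeing(drives, max_total=100.0):
    # 100 when all needs are satisfied, decreasing with the total deficit.
    deficit = sum(abs(v) for v in drives.values())
    return max(0.0, 100.0 * (1.0 - deficit / max_total))
```

In a loop, the robot would step its drives, pick the dominant motivation, act to reduce that drive, and use the resulting change in wellbeing to shape its affective state.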

    Applications and Trends in Social Robotics

    The study received funding from two projects: Development of social robots to help seniors with cognitive impairment (ROBSEN), financed by the Spanish Ministry of Economy, and RoboCity2030-III-CM, funded by the Comunidad de Madrid and co-financed by European Union Structural Funds.